Overview
Dataset statistics
| Number of variables | 19 |
|---|---|
| Number of observations | 1296675 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 468.5 MiB |
| Average record size in memory | 378.9 B |
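The average record size follows from the total memory and the number of observations. A quick arithmetic check, assuming 1 MiB means 2**20 bytes (the variable names are illustrative):

```python
# Figures copied from the Dataset statistics table.
total_mib = 468.5
n_rows = 1_296_675

# Assumption: 1 MiB = 2**20 bytes.
total_bytes = total_mib * 2**20
avg_record_bytes = total_bytes / n_rows

print(f"{avg_record_bytes:.1f} B per record")  # 378.9 B per record
```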
Variable types
| DateTime | 1 |
|---|---|
| Numeric | 12 |
| Text | 2 |
| Categorical | 4 |
Alerts
| Alert | Type |
|---|---|
| lat is highly overall correlated with merch_lat | High correlation |
| long is highly overall correlated with merch_long | High correlation |
| merch_lat is highly overall correlated with lat | High correlation |
| merch_long is highly overall correlated with long | High correlation |
| month is highly overall correlated with year | High correlation |
| year is highly overall correlated with month | High correlation |
| is_fraud is highly imbalanced (94.9%) | Imbalance |
| amt is highly skewed (γ1 = 42.27787379) | Skewed |
| hour has 42502 (3.3%) zeros | Zeros |
| dayofweek has 254282 (19.6%) zeros | Zeros |
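The 94.9% imbalance figure for is_fraud is consistent with one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition (the report itself does not state it) and uses the counts listed in the is_fraud section; `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits of the
    class distribution and k is the number of classes.
    0 means perfectly balanced; values near 1 mean highly imbalanced."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts reported for is_fraud: 1289169 zeros and 7506 ones.
score = imbalance_score([1_289_169, 7_506])
print(f"{score:.1%}")
```

A score this close to 1 means accuracy alone is a weak metric here: predicting 0 for every row would already be right for 1289169 of 1296675 rows.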
Reproduction
| Analysis started | 2025-12-02 02:46:14.039195 |
|---|---|
| Analysis finished | 2025-12-02 02:47:59.786773 |
| Duration | 1 minute and 45.75 seconds |
| Software version | ydata-profiling v4.18.0 |
| Download configuration | config.json |
Variables
| Distinct | 1274791 |
|---|---|
| Distinct (%) | 98.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.9 MiB |
| Minimum | 2019-01-01 00:00:18 |
|---|---|
| Maximum | 2020-06-21 12:13:37 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
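The hour, day, month, year, and dayofweek variables profiled later look like calendar fields derived from this timestamp column. That derivation is an assumption, since the report does not show the preprocessing. A minimal sketch with the standard library, applied to the minimum and maximum timestamps above (`calendar_features` is a hypothetical helper):

```python
from datetime import datetime

def calendar_features(ts):
    """Split a 'YYYY-MM-DD HH:MM:SS' string into calendar fields.
    dayofweek follows the Monday=0 ... Sunday=6 convention."""
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return {
        "hour": dt.hour,
        "day": dt.day,
        "month": dt.month,
        "year": dt.year,
        "dayofweek": dt.weekday(),
    }

first = calendar_features("2019-01-01 00:00:18")  # reported minimum
last = calendar_features("2020-06-21 12:13:37")   # reported maximum
print(first)
print(last)
```

Under this convention a zero in hour means midnight and a zero in dayofweek means Monday, so the Zeros alerts on those two columns would flag valid values rather than missing data.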
cc_num
Real number (ℝ)
| Distinct | 983 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.1719204 × 10¹⁷ |
| Minimum | 6.0416207 × 10¹⁰ |
|---|---|
| Maximum | 4.9923464 × 10¹⁸ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 MiB |
Quantile statistics
| Minimum | 6.0416207 × 10¹⁰ |
|---|---|
| 5-th percentile | 6.3048488 × 10¹¹ |
| Q1 | 1.8004295 × 10¹⁴ |
| median | 3.5214173 × 10¹⁵ |
| Q3 | 4.6422555 × 10¹⁵ |
| 95-th percentile | 4.497914 × 10¹⁸ |
| Maximum | 4.9923464 × 10¹⁸ |
| Range | 4.9923463 × 10¹⁸ |
| Interquartile range (IQR) | 4.4622125 × 10¹⁵ |
Descriptive statistics
| Standard deviation | 1.3088064 × 10¹⁸ |
|---|---|
| Coefficient of variation (CV) | 3.1371798 |
| Kurtosis | 6.1799499 |
| Mean | 4.1719204 × 10¹⁷ |
| Median Absolute Deviation (MAD) | 3.0764709 × 10¹⁵ |
| Skewness | 2.851879 |
| Sum | -6.7255419 × 10¹⁸ |
| Variance | 1.7129743 × 10³⁶ |
| Monotonicity | Not monotonic |
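The Sum above is negative even though the Minimum is positive and no values are negative. One plausible explanation (an assumption, not something the report states) is that a 64-bit integer accumulator wrapped around: 1296675 values with a mean near 4.17 × 10¹⁷ add up to far more than the int64 maximum of roughly 9.22 × 10¹⁸. The sketch emulates signed 64-bit wraparound in pure Python on made-up values; `wrap_int64` is a hypothetical helper.

```python
INT64_MIN = -2**63

def wrap_int64(value):
    """Map an arbitrary Python int onto the signed 64-bit range,
    mimicking two's-complement overflow."""
    return (value - INT64_MIN) % 2**64 + INT64_MIN

# Three made-up 19-digit identifiers; each one fits in int64 on its own.
ids = [4 * 10**18, 4 * 10**18, 4 * 10**18]

exact_sum = sum(ids)                 # Python ints never overflow
wrapped_sum = wrap_int64(exact_sum)  # what a 64-bit accumulator would hold

print("exact  :", exact_sum)
print("wrapped:", wrapped_sum)
```

Because cc_num is an identifier rather than a quantity, its sum, mean, and variance carry little meaning either way; profiling the column as a string would likely be the more informative choice.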
| Value | Count | Frequency (%) |
| 4.512828415 × 10¹⁸ | 3123 | 0.2% |
| 5.713652351 × 10¹¹ | 3123 | 0.2% |
| 3.672269902 × 10¹³ | 3119 | 0.2% |
| 2.131124026 × 10¹⁴ | 3117 | 0.2% |
| 3.54510934 × 10¹⁵ | 3113 | 0.2% |
| 6.534628261 × 10¹⁵ | 3112 | 0.2% |
| 6.011367958 × 10¹⁵ | 3110 | 0.2% |
| 2.720433096 × 10¹⁵ | 3107 | 0.2% |
| 6.011438889 × 10¹⁵ | 3106 | 0.2% |
| 6.011109737 × 10¹⁵ | 3101 | 0.2% |
| Other values (973) | 1265544 |
| Value | Count | Frequency (%) |
| 6.041620718 × 10¹⁰ | 1518 | |
| 6.042292873 × 10¹⁰ | 1531 | |
| 6.042309813 × 10¹⁰ | 510 | < 0.1% |
| 6.042785159 × 10¹⁰ | 528 | < 0.1% |
| 6.048700208 × 10¹⁰ | 496 | < 0.1% |
| 6.04905963 × 10¹⁰ | 1010 | |
| 6.049559311 × 10¹⁰ | 518 | < 0.1% |
| 5.018029536 × 10¹¹ | 1559 | |
| 5.018181333 × 10¹¹ | 8 | < 0.1% |
| 5.018282048 × 10¹¹ | 515 | < 0.1% |
| Value | Count | Frequency (%) |
| 4.992346398 × 10¹⁸ | 2059 | |
| 4.989847571 × 10¹⁸ | 1007 | 0.1% |
| 4.980323468 × 10¹⁸ | 532 | < 0.1% |
| 4.973530368 × 10¹⁸ | 1040 | |
| 4.958589672 × 10¹⁸ | 1476 | |
| 4.95682899 × 10¹⁸ | 2566 | |
| 4.911818931 × 10¹⁸ | 9 | < 0.1% |
| 4.906628656 × 10¹⁸ | 2584 | |
| 4.897067971 × 10¹⁸ | 1038 | |
| 4.890424427 × 10¹⁸ | 1496 |
merchant
Text
| Distinct | 693 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 99.1 MiB |
Length
| Max length | 43 |
|---|---|
| Median length | 36 |
| Mean length | 23.132597 |
| Min length | 13 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | fraud_Rippin, Kub and Mann |
|---|---|
| 2nd row | fraud_Heller, Gutmann and Zieme |
| 3rd row | fraud_Lind-Buckridge |
| 4th row | fraud_Kutch, Hermiston and Farrell |
| 5th row | fraud_Keeling-Crist |
| Value | Count | Frequency (%) |
| and | 474111 | 15.7% |
| llc | 97780 | 3.2% |
| inc | 91939 | 3.0% |
| sons | 73145 | 2.4% |
| ltd | 70853 | 2.3% |
| plc | 66475 | 2.2% |
| group | 50447 | 1.7% |
| fraud_kutch | 10560 | 0.3% |
| fraud_schaefer | 9394 | 0.3% |
| fraud_streich | 9250 | 0.3% |
| Other values (804) | 2069403 |
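The word table above suggests that merchant names are lowercased, stripped of punctuation such as commas, and split on whitespace before counting. The sketch below reproduces that idea on the five sample rows; the exact tokenization rules (for example, how hyphens are handled) are assumptions.

```python
from collections import Counter

# The five sample rows listed for the merchant variable.
sample_rows = [
    "fraud_Rippin, Kub and Mann",
    "fraud_Heller, Gutmann and Zieme",
    "fraud_Lind-Buckridge",
    "fraud_Kutch, Hermiston and Farrell",
    "fraud_Keeling-Crist",
]

def tokenize(name):
    # Lowercase, drop commas, split on whitespace.
    # Assumption: hyphenated names stay together as a single token.
    return name.lower().replace(",", "").split()

word_counts = Counter(tok for row in sample_rows for tok in tokenize(row))
print(word_counts.most_common(3))
```

Every sample row starts with the `fraud_` prefix, and the underscore count in the character table (1296675) equals the number of observations, so the prefix appears to be present on every row and carries no signal about is_fraud.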
Most occurring characters
| Value | Count | Frequency (%) |
| a | 2910697 | 9.7% |
| r | 2695758 | 9.0% |
| d | 2139780 | 7.1% |
| e | 1865710 | 6.2% |
| u | 1857912 | 6.2% |
| n | 1768848 | 5.9% |
| (space) | 1726682 | 5.8% |
| f | 1397378 | 4.7% |
| _ | 1296675 | 4.3% |
| o | 1129340 | 3.8% |
| Other values (45) | 11206680 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 29995460 |
category
Categorical
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 83.5 MiB |
| gas_transport | |
|---|---|
| grocery_pos | |
| home | |
| shopping_pos | |
| kids_pets | |
| Other values (9) |
Length
| Max length | 14 |
|---|---|
| Median length | 12 |
| Mean length | 10.526079 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | misc_net |
|---|---|
| 2nd row | grocery_pos |
| 3rd row | entertainment |
| 4th row | gas_transport |
| 5th row | misc_pos |
Common Values
| Value | Count | Frequency (%) |
| gas_transport | 131659 | |
| grocery_pos | 123638 | |
| home | 123115 | |
| shopping_pos | 116672 | |
| kids_pets | 113035 | |
| shopping_net | 97543 | |
| entertainment | 94014 | |
| food_dining | 91461 | 7.1% |
| personal_care | 90758 | 7.0% |
| health_fitness | 85879 | 6.6% |
| Other values (4) | 228901 |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 1429026 | |
| e | 1287345 | |
| o | 1231724 | |
| n | 1193757 | |
| p | 1083847 | 7.9% |
| t | 1076942 | 7.9% |
| _ | 1039039 | 7.6% |
| r | 917535 | 6.7% |
| i | 833007 | 6.1% |
| a | 665234 | 4.9% |
| Other values (10) | 2891447 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 13648903 |
amt
Real number (ℝ)
Skewed
| Distinct | 52928 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 70.351035 |
| Minimum | 1 |
|---|---|
| Maximum | 28948.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2.44 |
| Q1 | 9.65 |
| median | 47.52 |
| Q3 | 83.14 |
| 95-th percentile | 196.31 |
| Maximum | 28948.9 |
| Range | 28947.9 |
| Interquartile range (IQR) | 73.49 |
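The derived rows in this table follow directly from the quantiles, and the same figures give Tukey's outlier fences, a common rule of thumb that the report itself does not apply. A small check using only values from the table:

```python
# Quantile statistics reported for amt.
q1, q3 = 9.65, 83.14
minimum, maximum = 1.0, 28948.9
p95 = 196.31

iqr = q3 - q1
value_range = maximum - minimum

# Tukey fences: points beyond 1.5 * IQR from the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR = {iqr:.2f}, range = {value_range:.1f}")
print(f"fences = ({lower_fence:.3f}, {upper_fence:.3f})")
print("no low outliers:", minimum > lower_fence)
print("at least 5% of rows above the upper fence:", p95 > upper_fence)
```

Since the 95-th percentile already lies above the upper fence, this rule would flag at least 5% of transactions, which fits the heavy right tail described by the Skewed alert.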
Descriptive statistics
| Standard deviation | 160.31604 |
|---|---|
| Coefficient of variation (CV) | 2.2788014 |
| Kurtosis | 4545.645 |
| Mean | 70.351035 |
| Median Absolute Deviation (MAD) | 37.5 |
| Skewness | 42.277874 |
| Sum | 91222429 |
| Variance | 25701.232 |
| Monotonicity | Not monotonic |
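The Skewed alert rests on the skewness coefficient γ1. As a rough illustration on made-up amounts (not the dataset), the sketch below computes the moment-based form γ1 = m3 / m2^1.5 and shows that a log1p transform, a common remedy for right-skewed amounts, pulls it toward zero. Profiling tools may apply a sample-size correction, so their values can differ slightly from this formula.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment divided by the
    second central moment raised to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Made-up, geometrically growing amounts with a long right tail.
amounts = [1.0, 3.0, 9.0, 27.0, 81.0]

raw_skew = skewness(amounts)
log_skew = skewness([math.log1p(a) for a in amounts])

print(f"raw skewness:   {raw_skew:.2f}")
print(f"log1p skewness: {log_skew:.2f}")
```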
| Value | Count | Frequency (%) |
| 1.14 | 542 | < 0.1% |
| 1.04 | 538 | < 0.1% |
| 1.25 | 535 | < 0.1% |
| 1.02 | 533 | < 0.1% |
| 1.01 | 523 | < 0.1% |
| 1.05 | 519 | < 0.1% |
| 1.2 | 516 | < 0.1% |
| 1.23 | 515 | < 0.1% |
| 1.08 | 512 | < 0.1% |
| 1.11 | 509 | < 0.1% |
| Other values (52918) | 1291433 |
| Value | Count | Frequency (%) |
| 1 | 222 | |
| 1.01 | 523 | |
| 1.02 | 533 | |
| 1.03 | 499 | |
| 1.04 | 538 | |
| 1.05 | 519 | |
| 1.06 | 471 | |
| 1.07 | 498 | |
| 1.08 | 512 | |
| 1.09 | 496 |
| Value | Count | Frequency (%) |
| 28948.9 | 1 | |
| 27390.12 | 1 | |
| 27119.77 | 1 | |
| 26544.12 | 1 | |
| 25086.94 | 1 | |
| 17897.24 | 1 | |
| 15305.95 | 1 | |
| 15047.03 | 1 | |
| 15034.18 | 1 | |
| 14849.74 | 1 |
gender
Categorical
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | F |
|---|---|
| 2nd row | F |
| 3rd row | M |
| 4th row | M |
| 5th row | M |
Common Values
| Value | Count | Frequency (%) |
| F | 709863 | |
| M | 586812 |
Most occurring characters
| Value | Count | Frequency (%) |
| F | 709863 | |
| M | 586812 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1296675 |
lat
Real number (ℝ)
High correlation
| Distinct | 968 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.537622 |
| Minimum | 20.0271 |
|---|---|
| Maximum | 66.6933 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 MiB |
Quantile statistics
| Minimum | 20.0271 |
|---|---|
| 5-th percentile | 29.8826 |
| Q1 | 34.6205 |
| median | 39.3543 |
| Q3 | 41.9404 |
| 95-th percentile | 45.8433 |
| Maximum | 66.6933 |
| Range | 46.6662 |
| Interquartile range (IQR) | 7.3199 |
Descriptive statistics
| Standard deviation | 5.0758084 |
|---|---|
| Coefficient of variation (CV) | 0.13171047 |
| Kurtosis | 0.81296795 |
| Mean | 38.537622 |
| Median Absolute Deviation (MAD) | 3.3597 |
| Skewness | -0.18602768 |
| Sum | 49970771 |
| Variance | 25.763831 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 36.385 | 3646 | 0.3% |
| 26.1184 | 3613 | 0.3% |
| 42.5164 | 3597 | 0.3% |
| 43.0048 | 3527 | 0.3% |
| 44.5995 | 3123 | 0.2% |
| 39.8936 | 3123 | 0.2% |
| 33.2887 | 3119 | 0.2% |
| 34.0326 | 3117 | 0.2% |
| 33.4783 | 3113 | 0.2% |
| 44.3346 | 3112 | 0.2% |
| Other values (958) | 1263585 |
| Value | Count | Frequency (%) |
| 20.0271 | 1527 | |
| 20.0827 | 1032 | 0.1% |
| 24.6557 | 2584 | |
| 26.1184 | 3613 | |
| 26.3304 | 542 | < 0.1% |
| 26.3771 | 518 | < 0.1% |
| 26.4215 | 3038 | |
| 26.4722 | 2524 | |
| 26.529 | 1549 | |
| 26.6939 | 1027 | 0.1% |
| Value | Count | Frequency (%) |
| 66.6933 | 12 | < 0.1% |
| 65.6899 | 540 | < 0.1% |
| 64.7556 | 1568 | |
| 48.8878 | 3030 | |
| 48.8856 | 2066 | |
| 48.8328 | 1533 | |
| 48.6669 | 1047 | 0.1% |
| 48.6031 | 2973 | |
| 48.4786 | 2038 | |
| 48.34 | 3088 |
long
Real number (ℝ)
High correlation
| Distinct | 969 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -90.226335 |
| Minimum | -165.6723 |
|---|---|
| Maximum | -67.9503 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 1296675 |
| Negative (%) | 100.0% |
| Memory size | 9.9 MiB |
Quantile statistics
| Minimum | -165.6723 |
|---|---|
| 5-th percentile | -119.0825 |
| Q1 | -96.798 |
| median | -87.4769 |
| Q3 | -80.158 |
| 95-th percentile | -73.5112 |
| Maximum | -67.9503 |
| Range | 97.722 |
| Interquartile range (IQR) | 16.64 |
Descriptive statistics
| Standard deviation | 13.759077 |
|---|---|
| Coefficient of variation (CV) | -0.15249513 |
| Kurtosis | 1.8558923 |
| Mean | -90.226335 |
| Median Absolute Deviation (MAD) | 8.1527 |
| Skewness | -1.1501077 |
| Sum | -1.1699423 × 10⁸ |
| Variance | 189.3122 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -98.0727 | 3646 | 0.3% |
| -81.7361 | 3613 | 0.3% |
| -82.9832 | 3597 | 0.3% |
| -108.8964 | 3527 | 0.3% |
| -79.7856 | 3123 | 0.2% |
| -86.2141 | 3123 | 0.2% |
| -111.0985 | 3119 | 0.2% |
| -82.2027 | 3117 | 0.2% |
| -90.5142 | 3113 | 0.2% |
| -73.098 | 3112 | 0.2% |
| Other values (959) | 1263585 |
| Value | Count | Frequency (%) |
| -165.6723 | 1568 | |
| -156.292 | 540 | < 0.1% |
| -155.488 | 1032 | |
| -155.3697 | 1527 | |
| -153.994 | 12 | < 0.1% |
| -124.4409 | 1043 | |
| -124.2174 | 1547 | |
| -124.1587 | 1031 | |
| -124.1437 | 1526 | |
| -123.9743 | 2036 |
| Value | Count | Frequency (%) |
| -67.9503 | 2080 | |
| -68.5565 | 1014 | 0.1% |
| -69.2675 | 519 | < 0.1% |
| -69.4828 | 2050 | |
| -69.9576 | 537 | < 0.1% |
| -69.9656 | 3107 | |
| -70.1031 | 9 | < 0.1% |
| -70.239 | 1036 | 0.1% |
| -70.3001 | 2090 | |
| -70.3457 | 1527 |
city_pop
Real number (ℝ)
| Distinct | 879 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 88824.441 |
| Minimum | 23 |
|---|---|
| Maximum | 2906700 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 MiB |
Quantile statistics
| Minimum | 23 |
|---|---|
| 5-th percentile | 139 |
| Q1 | 743 |
| median | 2456 |
| Q3 | 20328 |
| 95-th percentile | 525713 |
| Maximum | 2906700 |
| Range | 2906677 |
| Interquartile range (IQR) | 19585 |
Descriptive statistics
| Standard deviation | 301956.36 |
|---|---|
| Coefficient of variation (CV) | 3.3994738 |
| Kurtosis | 37.614519 |
| Mean | 88824.441 |
| Median Absolute Deviation (MAD) | 2198 |
| Skewness | 5.5938531 |
| Sum | 1.1517643 × 10¹¹ |
| Variance | 9.1177644 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 606 | 5496 | 0.4% |
| 1595797 | 5130 | 0.4% |
| 1312922 | 5075 | 0.4% |
| 1766 | 4574 | 0.4% |
| 241 | 4533 | 0.3% |
| 2906700 | 4168 | 0.3% |
| 276002 | 4155 | 0.3% |
| 302 | 4147 | 0.3% |
| 910148 | 4073 | 0.3% |
| 198 | 4067 | 0.3% |
| Other values (869) | 1251257 |
| Value | Count | Frequency (%) |
| 23 | 2049 | |
| 37 | 1013 | 0.1% |
| 43 | 2034 | |
| 46 | 3040 | |
| 47 | 511 | < 0.1% |
| 49 | 1054 | 0.1% |
| 51 | 1016 | 0.1% |
| 52 | 518 | < 0.1% |
| 53 | 2610 | |
| 60 | 1045 | 0.1% |
| Value | Count | Frequency (%) |
| 2906700 | 4168 | |
| 2504700 | 2033 | 0.2% |
| 2383912 | 521 | < 0.1% |
| 1595797 | 5130 | |
| 1577385 | 2563 | |
| 1526206 | 3517 | |
| 1417793 | 8 | < 0.1% |
| 1382480 | 2056 | |
| 1312922 | 5075 | |
| 1263321 | 3629 |
job
Text
| Distinct | 494 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 95.5 MiB |
Length
| Max length | 59 |
|---|---|
| Median length | 38 |
| Mean length | 20.227102 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Psychologist, counselling |
|---|---|
| 2nd row | Special educational needs teacher |
| 3rd row | Nature conservation officer |
| 4th row | Patent attorney |
| 5th row | Dance movement psychotherapist |
| Value | Count | Frequency (%) |
| engineer | 131756 | 4.6% |
| officer | 110915 | 3.9% |
| manager | 61124 | 2.1% |
| scientist | 55878 | 1.9% |
| designer | 52218 | 1.8% |
| surveyor | 49062 | 1.7% |
| teacher | 38126 | 1.3% |
| psychologist | 32600 | 1.1% |
| research | 29754 | 1.0% |
| editor | 28725 | 1.0% |
| Other values (456) | 2289024 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 2803032 | 10.7% |
| i | 2386346 | 9.1% |
| r | 2198669 | 8.4% |
| a | 1813638 | 6.9% |
| t | 1782302 | 6.8% |
| n | 1764769 | 6.7% |
| (space) | 1582507 | 6.0% |
| o | 1491775 | 5.7% |
| s | 1444701 | 5.5% |
| c | 1323152 | 5.0% |
| Other values (43) | 7637087 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 26227978 |
merch_lat
Real number (ℝ)
High correlation
| Distinct | 1247805 |
|---|---|
| Distinct (%) | 96.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.537338 |
| Minimum | 19.027785 |
|---|---|
| Maximum | 67.510267 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 MiB |
Quantile statistics
| Minimum | 19.027785 |
|---|---|
| 5-th percentile | 29.751653 |
| Q1 | 34.733572 |
| median | 39.36568 |
| Q3 | 41.957164 |
| 95-th percentile | 46.00353 |
| Maximum | 67.510267 |
| Range | 48.482482 |
| Interquartile range (IQR) | 7.223592 |
Descriptive statistics
| Standard deviation | 5.1097884 |
|---|---|
| Coefficient of variation (CV) | 0.13259318 |
| Kurtosis | 0.79599391 |
| Mean | 38.537338 |
| Median Absolute Deviation (MAD) | 3.397536 |
| Skewness | -0.18191543 |
| Sum | 49970403 |
| Variance | 26.109937 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 39.983138 | 4 | < 0.1% |
| 41.731663 | 4 | < 0.1% |
| 34.134994 | 4 | < 0.1% |
| 40.456305 | 4 | < 0.1% |
| 42.920584 | 4 | < 0.1% |
| 42.889354 | 4 | < 0.1% |
| 40.557026 | 4 | < 0.1% |
| 37.695715 | 4 | < 0.1% |
| 43.327495 | 4 | < 0.1% |
| 41.271468 | 4 | < 0.1% |
| Other values (1247795) | 1296635 |
| Value | Count | Frequency (%) |
| 19.027785 | 1 | |
| 19.027804 | 1 | |
| 19.029798 | 1 | |
| 19.031242 | 1 | |
| 19.032277 | 1 | |
| 19.033288 | 1 | |
| 19.034282 | 1 | |
| 19.034687 | 1 | |
| 19.035472 | 1 | |
| 19.036312 | 1 |
| Value | Count | Frequency (%) |
| 67.510267 | 1 | |
| 67.441518 | 1 | |
| 67.397018 | 1 | |
| 67.188111 | 1 | |
| 67.064277 | 1 | |
| 66.835174 | 1 | |
| 66.682905 | 1 | |
| 66.67355 | 1 | |
| 66.664673 | 1 | |
| 66.659242 | 1 |
merch_long
Real number (ℝ)
High correlation
| Distinct | 1275745 |
|---|---|
| Distinct (%) | 98.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -90.226465 |
| Minimum | -166.67124 |
|---|---|
| Maximum | -66.950902 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 1296675 |
| Negative (%) | 100.0% |
| Memory size | 9.9 MiB |
Quantile statistics
| Minimum | -166.67124 |
|---|---|
| 5-th percentile | -119.33009 |
| Q1 | -96.897276 |
| median | -87.438392 |
| Q3 | -80.236796 |
| 95-th percentile | -73.354218 |
| Maximum | -66.950902 |
| Range | 99.72034 |
| Interquartile range (IQR) | 16.660479 |
Descriptive statistics
| Standard deviation | 13.771091 |
|---|---|
| Coefficient of variation (CV) | -0.15262806 |
| Kurtosis | 1.8484792 |
| Mean | -90.226465 |
| Median Absolute Deviation (MAD) | 8.227889 |
| Skewness | -1.1469599 |
| Sum | -1.169944 × 10⁸ |
| Variance | 189.64294 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -74.618269 | 4 | < 0.1% |
| -81.219189 | 4 | < 0.1% |
| -87.116414 | 4 | < 0.1% |
| -88.49309 | 3 | < 0.1% |
| -82.055036 | 3 | < 0.1% |
| -95.73937 | 3 | < 0.1% |
| -94.024818 | 3 | < 0.1% |
| -79.588155 | 3 | < 0.1% |
| -123.154862 | 3 | < 0.1% |
| -82.658224 | 3 | < 0.1% |
| Other values (1275735) | 1296642 |
| Value | Count | Frequency (%) |
| -166.671242 | 1 | |
| -166.670132 | 1 | |
| -166.669638 | 1 | |
| -166.666179 | 1 | |
| -166.664828 | 1 | |
| -166.662888 | 1 | |
| -166.661968 | 1 | |
| -166.659277 | 1 | |
| -166.657834 | 1 | |
| -166.657174 | 1 |
| Value | Count | Frequency (%) |
| -66.950902 | 1 | |
| -66.955996 | 1 | |
| -66.95654 | 1 | |
| -66.958659 | 1 | |
| -66.958751 | 1 | |
| -66.959178 | 1 | |
| -66.961923 | 1 | |
| -66.962913 | 1 | |
| -66.963918 | 1 | |
| -66.963975 | 1 |
is_fraud
Categorical
Imbalance
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 71.7 MiB |
| 0 | |
|---|---|
| 1 | 7506 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1289169 | |
| 1 | 7506 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1289169 | |
| 1 | 7506 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1296675 |
hour
Real number (ℝ)
Zeros
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.804858 |
| Minimum | 0 |
|---|---|
| Maximum | 23 |
| Zeros | 42502 |
| Zeros (%) | 3.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 7 |
| median | 14 |
| Q3 | 19 |
| 95-th percentile | 23 |
| Maximum | 23 |
| Range | 23 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 6.8178239 |
|---|---|
| Coefficient of variation (CV) | 0.53244042 |
| Kurtosis | -1.0795803 |
| Mean | 12.804858 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | -0.28282545 |
| Sum | 16603739 |
| Variance | 46.482723 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 23 | 67104 | 5.2% |
| 22 | 66982 | 5.2% |
| 18 | 66051 | 5.1% |
| 16 | 65726 | 5.1% |
| 21 | 65533 | 5.1% |
| 19 | 65508 | 5.1% |
| 17 | 65450 | 5.0% |
| 15 | 65391 | 5.0% |
| 13 | 65314 | 5.0% |
| 12 | 65257 | 5.0% |
| Other values (14) | 638359 |
| Value | Count | Frequency (%) |
| 0 | 42502 | |
| 1 | 42869 | |
| 2 | 42656 | |
| 3 | 42769 | |
| 4 | 41863 | |
| 5 | 42171 | |
| 6 | 42300 | |
| 7 | 42203 | |
| 8 | 42505 | |
| 9 | 42185 |
| Value | Count | Frequency (%) |
| 23 | 67104 | |
| 22 | 66982 | |
| 21 | 65533 | |
| 20 | 65098 | |
| 19 | 65508 | |
| 18 | 66051 | |
| 17 | 65450 | |
| 16 | 65726 | |
| 15 | 65391 | |
| 14 | 64885 |
day
Real number (ℝ)
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.587978 |
| Minimum | 1 |
|---|---|
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 8 |
| median | 15 |
| Q3 | 23 |
| 95-th percentile | 30 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.8291214 |
|---|---|
| Coefficient of variation (CV) | 0.5664058 |
| Kurtosis | -1.1871417 |
| Mean | 15.587978 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.030847364 |
| Sum | 20212542 |
| Variance | 77.953384 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 47089 | 3.6% |
| 15 | 46213 | 3.6% |
| 8 | 46201 | 3.6% |
| 16 | 44894 | 3.5% |
| 2 | 44748 | 3.5% |
| 9 | 44685 | 3.4% |
| 7 | 44239 | 3.4% |
| 14 | 44015 | 3.4% |
| 28 | 43470 | 3.4% |
| 17 | 42272 | 3.3% |
| Other values (21) | 848849 |
| Value | Count | Frequency (%) |
| 1 | 47089 | |
| 2 | 44748 | |
| 3 | 41842 | |
| 4 | 41479 | |
| 5 | 41886 | |
| 6 | 41420 | |
| 7 | 44239 | |
| 8 | 46201 | |
| 9 | 44685 | |
| 10 | 41934 |
| Value | Count | Frequency (%) |
| 31 | 24701 | |
| 30 | 41019 | |
| 29 | 39617 | |
| 28 | 43470 | |
| 27 | 39684 | |
| 26 | 40692 | |
| 25 | 40374 | |
| 24 | 41360 | |
| 23 | 40815 | |
| 22 | 42061 |
month
Real number (ℝ)
High correlation
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.1421497 |
| Minimum | 1 |
|---|---|
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 6 |
| Q3 | 9 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.4177033 |
|---|---|
| Coefficient of variation (CV) | 0.55643439 |
| Kurtosis | -1.0475463 |
| Mean | 6.1421497 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.29851575 |
| Sum | 7964372 |
| Variance | 11.680696 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5 | 146875 | |
| 6 | 143811 | |
| 3 | 143789 | |
| 12 | 141060 | |
| 4 | 134970 | |
| 1 | 104727 | |
| 2 | 97657 | |
| 8 | 87359 | |
| 7 | 86596 | |
| 9 | 70652 | |
| Other values (2) | 139179 |
| Value | Count | Frequency (%) |
| 1 | 104727 | |
| 2 | 97657 | |
| 3 | 143789 | |
| 4 | 134970 | |
| 5 | 146875 | |
| 6 | 143811 | |
| 7 | 86596 | |
| 8 | 87359 | |
| 9 | 70652 | |
| 10 | 68758 |
| Value | Count | Frequency (%) |
| 12 | 141060 | |
| 11 | 70421 | |
| 10 | 68758 | |
| 9 | 70652 | |
| 8 | 87359 | |
| 7 | 86596 | |
| 6 | 143811 | |
| 5 | 146875 | |
| 4 | 134970 | |
| 3 | 143789 |
year
Categorical
High correlation
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 75.4 MiB |
| 2019 | |
|---|---|
| 2020 |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2019 |
|---|---|
| 2nd row | 2019 |
| 3rd row | 2019 |
| 4th row | 2019 |
| 5th row | 2019 |
Common Values
| Value | Count | Frequency (%) |
| 2019 | 924850 | |
| 2020 | 371825 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1668500 | |
| 0 | 1668500 | |
| 1 | 924850 | |
| 9 | 924850 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 5186700 |
dayofweek
Real number (ℝ)
Zeros
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.0706037 |
| Minimum | 0 |
|---|---|
| Maximum | 6 |
| Zeros | 254282 |
| Zeros (%) | 19.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 6 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.1981526 |
|---|---|
| Coefficient of variation (CV) | 0.71586984 |
| Kurtosis | -1.445049 |
| Mean | 3.0706037 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.078453041 |
| Sum | 3981575 |
| Variance | 4.8318747 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 254282 | |
| 6 | 250579 | |
| 5 | 200957 | |
| 1 | 160227 | |
| 4 | 152272 | |
| 3 | 147285 | |
| 2 | 131073 |
| Value | Count | Frequency (%) |
| 0 | 254282 | |
| 1 | 160227 | |
| 2 | 131073 | |
| 3 | 147285 | |
| 4 | 152272 | |
| 5 | 200957 | |
| 6 | 250579 |
| Value | Count | Frequency (%) |
| 6 | 250579 | |
| 5 | 200957 | |
| 4 | 152272 | |
| 3 | 147285 | |
| 2 | 131073 | |
| 1 | 160227 | |
| 0 | 254282 |
card_holder_age
Real number (ℝ)
| Distinct | 83 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 46.029298 |
| Minimum | 14 |
|---|---|
| Maximum | 96 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 14 |
|---|---|
| 5-th percentile | 22 |
| Q1 | 33 |
| median | 44 |
| Q3 | 57 |
| 95-th percentile | 80 |
| Maximum | 96 |
| Range | 82 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 17.382373 |
|---|---|
| Coefficient of variation (CV) | 0.37763714 |
| Kurtosis | -0.17600385 |
| Mean | 46.029298 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 0.61226204 |
| Sum | 59685040 |
| Variance | 302.14688 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
|---|---|---|
| 47 | 41337 | 3.2% |
| 35 | 39331 | 3.0% |
| 34 | 35816 | 2.8% |
| 32 | 35588 | 2.7% |
| 33 | 33430 | 2.6% |
| 45 | 33098 | 2.6% |
| 48 | 32719 | 2.5% |
| 46 | 32212 | 2.5% |
| 44 | 31035 | 2.4% |
| 43 | 30528 | 2.4% |
| Other values (73) | 951581 | 73.4% |

| Value | Count | Frequency (%) |
|---|---|---|
| 14 | 1318 | 0.1% |
| 15 | 5817 | 0.4% |
| 16 | 5104 | 0.4% |
| 17 | 1191 | 0.1% |
| 18 | 3901 | 0.3% |
| 19 | 8203 | 0.6% |
| 20 | 16326 | 1.3% |
| 21 | 14915 | 1.2% |
| 22 | 24536 | 1.9% |
| 23 | 13209 | 1.0% |

| Value | Count | Frequency (%) |
|---|---|---|
| 96 | 138 | < 0.1% |
| 95 | 398 | < 0.1% |
| 94 | 1722 | 0.1% |
| 93 | 5684 | 0.4% |
| 92 | 4450 | 0.3% |
| 91 | 4824 | 0.4% |
| 90 | 5443 | 0.4% |
| 89 | 3916 | 0.3% |
| 88 | 3843 | 0.3% |
| 87 | 2364 | 0.2% |
Interactions
Correlations
| | amt | card_holder_age | category | cc_num | city_pop | day | dayofweek | gender | hour | is_fraud | lat | long | merch_lat | merch_long | month | year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| amt | 1.000 | -0.024 | 0.020 | -0.001 | -0.024 | 0.000 | -0.001 | 0.000 | -0.154 | 0.000 | 0.012 | -0.000 | 0.012 | 0.000 | -0.003 | 0.000 |
| card_holder_age | -0.024 | 1.000 | 0.046 | -0.037 | -0.157 | -0.001 | -0.014 | 0.110 | -0.173 | 0.018 | 0.037 | -0.020 | 0.036 | -0.020 | -0.010 | 0.041 |
| category | 0.020 | 0.046 | 1.000 | 0.009 | 0.014 | 0.001 | 0.003 | 0.054 | 0.271 | 0.071 | 0.011 | 0.009 | 0.011 | 0.009 | 0.001 | 0.000 |
| cc_num | -0.001 | -0.037 | 0.009 | 1.000 | 0.049 | -0.000 | -0.001 | 0.051 | 0.011 | 0.006 | -0.004 | -0.013 | -0.004 | -0.013 | 0.001 | 0.000 |
| city_pop | -0.024 | -0.157 | 0.014 | 0.049 | 1.000 | -0.001 | 0.002 | 0.089 | 0.033 | 0.004 | -0.265 | 0.087 | -0.264 | 0.086 | 0.001 | 0.001 |
| day | 0.000 | -0.001 | 0.001 | -0.000 | -0.001 | 1.000 | 0.017 | 0.000 | -0.000 | 0.009 | -0.000 | 0.000 | -0.000 | 0.000 | 0.008 | 0.057 |
| dayofweek | -0.001 | -0.014 | 0.003 | -0.001 | 0.002 | 0.017 | 1.000 | 0.006 | 0.000 | 0.012 | 0.001 | 0.001 | 0.000 | 0.001 | 0.038 | 0.090 |
| gender | 0.000 | 0.110 | 0.054 | 0.051 | 0.089 | 0.000 | 0.006 | 1.000 | 0.045 | 0.008 | 0.101 | 0.091 | 0.103 | 0.082 | 0.002 | 0.000 |
| hour | -0.154 | -0.173 | 0.271 | 0.011 | 0.033 | -0.000 | 0.000 | 0.045 | 1.000 | 0.095 | -0.011 | -0.006 | -0.010 | -0.006 | -0.001 | 0.001 |
| is_fraud | 0.000 | 0.018 | 0.071 | 0.006 | 0.004 | 0.009 | 0.012 | 0.008 | 0.095 | 1.000 | 0.008 | 0.006 | 0.008 | 0.005 | 0.018 | 0.003 |
| lat | 0.012 | 0.037 | 0.011 | -0.004 | -0.265 | -0.000 | 0.001 | 0.101 | -0.011 | 0.008 | 1.000 | 0.106 | 0.991 | 0.105 | -0.001 | 0.002 |
| long | -0.000 | -0.020 | 0.009 | -0.013 | 0.087 | 0.000 | 0.001 | 0.091 | -0.006 | 0.006 | 0.106 | 1.000 | 0.106 | 0.998 | -0.001 | 0.000 |
| merch_lat | 0.012 | 0.036 | 0.011 | -0.004 | -0.264 | -0.000 | 0.000 | 0.103 | -0.010 | 0.008 | 0.991 | 0.106 | 1.000 | 0.104 | -0.001 | 0.000 |
| merch_long | 0.000 | -0.020 | 0.009 | -0.013 | 0.086 | 0.000 | 0.001 | 0.082 | -0.006 | 0.005 | 0.105 | 0.998 | 0.104 | 1.000 | -0.001 | 0.000 |
| month | -0.003 | -0.010 | 0.001 | 0.001 | 0.001 | 0.008 | 0.038 | 0.002 | -0.001 | 0.018 | -0.001 | -0.001 | -0.001 | -0.001 | 1.000 | 0.527 |
| year | 0.000 | 0.041 | 0.000 | 0.000 | 0.001 | 0.057 | 0.090 | 0.000 | 0.001 | 0.003 | 0.002 | 0.000 | 0.000 | 0.000 | 0.527 | 1.000 |
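The high-correlation alerts in the overview can be cross-checked against this matrix by filtering on absolute value. The sketch below copies a handful of coefficients from the table and applies a cut-off of 0.5. The cut-off is a hypothetical value chosen for illustration, not one stated in the report, and the `pairs` dictionary is a hand-picked subset rather than the full matrix.

```python
# A few coefficients copied from the correlation matrix (one entry per pair).
pairs = {
    ("lat", "merch_lat"): 0.991,
    ("long", "merch_long"): 0.998,
    ("month", "year"): 0.527,
    ("category", "hour"): 0.271,
    ("city_pop", "lat"): -0.265,
    ("card_holder_age", "hour"): -0.173,
    ("amt", "hour"): -0.154,
}

THRESHOLD = 0.5  # hypothetical cut-off, not taken from the report

# Keep pairs whose absolute coefficient reaches the cut-off, strongest first.
flagged = sorted(
    ((a, b, r) for (a, b), r in pairs.items() if abs(r) >= THRESHOLD),
    key=lambda item: -abs(item[2]),
)

for a, b, r in flagged:
    print(f"{a} ~ {b}: {r:+.3f}")
```

With this cut-off the filter keeps long/merch_long, lat/merch_lat and month/year, the same three pairs named in the overview alerts; every other coefficient in the matrix has an absolute value below 0.3.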
Missing values
Sample
| | trans_date_trans_time | cc_num | merchant | category | amt | gender | lat | long | city_pop | job | merch_lat | merch_long | is_fraud | hour | day | month | year | dayofweek | card_holder_age |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2019-01-01 00:00:18 | 2703186189652095 | fraud_Rippin, Kub and Mann | misc_net | 4.97 | F | 36.0788 | -81.1781 | 3495 | Psychologist, counselling | 36.011293 | -82.048315 | 0 | 0 | 1 | 1 | 2019 | 1 | 31 |
| 1 | 2019-01-01 00:00:44 | 630423337322 | fraud_Heller, Gutmann and Zieme | grocery_pos | 107.23 | F | 48.8878 | -118.2105 | 149 | Special educational needs teacher | 49.159047 | -118.186462 | 0 | 0 | 1 | 1 | 2019 | 1 | 41 |
| 2 | 2019-01-01 00:00:51 | 38859492057661 | fraud_Lind-Buckridge | entertainment | 220.11 | M | 42.1808 | -112.2620 | 4154 | Nature conservation officer | 43.150704 | -112.154481 | 0 | 0 | 1 | 1 | 2019 | 1 | 57 |
| 3 | 2019-01-01 00:01:16 | 3534093764340240 | fraud_Kutch, Hermiston and Farrell | gas_transport | 45.00 | M | 46.2306 | -112.1138 | 1939 | Patent attorney | 47.034331 | -112.561071 | 0 | 0 | 1 | 1 | 2019 | 1 | 52 |
| 4 | 2019-01-01 00:03:06 | 375534208663984 | fraud_Keeling-Crist | misc_pos | 41.96 | M | 38.4207 | -79.4629 | 99 | Dance movement psychotherapist | 38.674999 | -78.632459 | 0 | 0 | 1 | 1 | 2019 | 1 | 33 |
| 5 | 2019-01-01 00:04:08 | 4767265376804500 | fraud_Stroman, Hudson and Erdman | gas_transport | 94.63 | F | 40.3750 | -75.2045 | 2158 | Transport planner | 40.653382 | -76.152667 | 0 | 0 | 1 | 1 | 2019 | 1 | 58 |
| 6 | 2019-01-01 00:04:42 | 30074693890476 | fraud_Rowe-Vandervort | grocery_net | 44.54 | F | 37.9931 | -100.9893 | 2691 | Arboriculturist | 37.162705 | -100.153370 | 0 | 0 | 1 | 1 | 2019 | 1 | 26 |
| 7 | 2019-01-01 00:05:08 | 6011360759745864 | fraud_Corwin-Collins | gas_transport | 71.65 | M | 38.8432 | -78.6003 | 6018 | Designer, multimedia | 38.948089 | -78.540296 | 0 | 0 | 1 | 1 | 2019 | 1 | 72 |
| 8 | 2019-01-01 00:05:18 | 4922710831011201 | fraud_Herzog Ltd | misc_pos | 4.27 | F | 40.3359 | -79.6607 | 1472 | Public affairs consultant | 40.351813 | -79.958146 | 0 | 0 | 1 | 1 | 2019 | 1 | 78 |
| 9 | 2019-01-01 00:06:01 | 2720830304681674 | fraud_Schoen, Kuphal and Nitzsche | grocery_pos | 198.39 | F | 36.5220 | -87.3490 | 151785 | Pathologist | 37.179198 | -87.485381 | 0 | 0 | 1 | 1 | 2019 | 1 | 45 |

| | trans_date_trans_time | cc_num | merchant | category | amt | gender | lat | long | city_pop | job | merch_lat | merch_long | is_fraud | hour | day | month | year | dayofweek | card_holder_age |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1296665 | 2020-06-21 12:08:42 | 213193596103206 | fraud_Gulgowski LLC | home | 72.17 | M | 45.7549 | -84.4470 | 95 | Electrical engineer | 44.938461 | -83.996234 | 0 | 12 | 21 | 6 | 2020 | 6 | 26 |
| 1296666 | 2020-06-21 12:09:22 | 4587657402165341815 | fraud_Hyatt, Russel and Gleichner | health_fitness | 7.30 | F | 41.0646 | -87.5917 | 2135 | Psychotherapist, child | 40.556811 | -88.092339 | 0 | 12 | 21 | 6 | 2020 | 6 | 16 |
| 1296667 | 2020-06-21 12:10:56 | 4822367783500458 | fraud_Hahn, Douglas and Schowalter | travel | 19.71 | M | 28.0758 | -81.5929 | 33804 | Exercise physiologist | 27.465871 | -81.511804 | 0 | 12 | 21 | 6 | 2020 | 6 | 29 |
| 1296668 | 2020-06-21 12:11:23 | 213141712584544 | fraud_Metz, Russel and Metz | kids_pets | 100.85 | F | 32.1530 | -90.1217 | 19685 | Fine artist | 31.377697 | -90.528450 | 0 | 12 | 21 | 6 | 2020 | 6 | 36 |
| 1296669 | 2020-06-21 12:11:36 | 4400011257587661852 | fraud_Stiedemann Inc | misc_pos | 37.38 | F | 41.4972 | -98.7858 | 509 | Nurse, children's | 41.728638 | -99.039660 | 0 | 12 | 21 | 6 | 2020 | 6 | 40 |
| 1296670 | 2020-06-21 12:12:08 | 30263540414123 | fraud_Reichel Inc | entertainment | 15.56 | M | 37.7175 | -112.4777 | 258 | Geoscientist | 36.841266 | -111.690765 | 0 | 12 | 21 | 6 | 2020 | 6 | 59 |
| 1296671 | 2020-06-21 12:12:19 | 6011149206456997 | fraud_Abernathy and Sons | food_dining | 51.70 | M | 39.2667 | -77.5101 | 100 | Production assistant, television | 38.906881 | -78.246528 | 0 | 12 | 21 | 6 | 2020 | 6 | 41 |
| 1296672 | 2020-06-21 12:12:32 | 3514865930894695 | fraud_Stiedemann Ltd | food_dining | 105.93 | M | 32.9396 | -105.8189 | 899 | Naval architect | 33.619513 | -105.130529 | 0 | 12 | 21 | 6 | 2020 | 6 | 53 |
| 1296673 | 2020-06-21 12:13:36 | 2720012583106919 | fraud_Reinger, Weissnat and Strosin | food_dining | 74.90 | M | 43.3526 | -102.5411 | 1126 | Volunteer coordinator | 42.788940 | -103.241160 | 0 | 12 | 21 | 6 | 2020 | 6 | 40 |
| 1296674 | 2020-06-21 12:13:37 | 4292902571056973207 | fraud_Langosh, Wintheiser and Hyatt | food_dining | 4.30 | M | 45.8433 | -113.8748 | 218 | Therapist, horticultural | 46.565983 | -114.186110 | 0 | 12 | 21 | 6 | 2020 | 6 | 25 |
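The hour, day, month, year and dayofweek columns in the sample rows are consistent with values derived from trans_date_trans_time, with dayofweek numbered from Monday = 0 to Sunday = 6. The sketch below reproduces them for the first and last sample timestamps using only the Python standard library. The helper name `calendar_features` is hypothetical, and this is not necessarily how the dataset's author built the columns.

```python
from datetime import datetime

# Timestamps copied from the first and last sample rows.
samples = ["2019-01-01 00:00:18", "2020-06-21 12:13:37"]

def calendar_features(ts: str) -> dict:
    """Derive calendar fields from a 'YYYY-MM-DD HH:MM:SS' string."""
    t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return {
        "hour": t.hour,
        "day": t.day,
        "month": t.month,
        "year": t.year,
        "dayofweek": t.weekday(),  # Monday = 0 ... Sunday = 6
    }

for ts in samples:
    print(ts, calendar_features(ts))
```

Under this numbering, the 254282 zeros reported for dayofweek are Monday transactions rather than missing or placeholder values.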